Untrimmed Video Classification for Activity Detection: submission to ActivityNet Challenge

Authors

  • Gurkirt Singh
  • Fabio Cuzzolin
Abstract

Current state-of-the-art human activity recognition focuses on the classification of temporally trimmed videos in which only one action occurs per frame. We propose a simple yet effective method for the temporal detection of activities in untrimmed videos, built on top of untrimmed video classification. First, our model predicts the top-k labels for each untrimmed video by analysing global, video-level features. Second, frame-level binary classification is combined with dynamic programming to generate temporally trimmed activity proposals. Finally, each proposal is assigned one of the global labels and scored by combining the temporal proposal score with the global classification score. We show that untrimmed video classification models can serve as a stepping stone for temporal detection.
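
The abstract gives no implementation details, but the second stage can be illustrated with a small Python sketch: a two-state dynamic program turns per-frame binary ("actionness") scores into contiguous temporal proposals. The function below, together with its bias and switch_cost parameters, is an illustrative assumption rather than the authors' code.

    import numpy as np

    def temporal_proposals(actionness, bias=0.5, switch_cost=2.0):
        """Hypothetical sketch: turn per-frame actionness scores into
        contiguous temporal proposals with a 2-state dynamic program.

        actionness : (T,) frame-level action probabilities in [0, 1].
        bias       : score below which a frame prefers the background state.
        switch_cost: penalty for toggling between background and action,
                     which discourages fragmented proposals.
        Returns a list of (start, end, score) segments (end exclusive).
        """
        actionness = np.asarray(actionness, dtype=float)
        T = len(actionness)
        # unary rewards for labelling a frame background (0) or action (1)
        unary = np.stack([bias - actionness, actionness - bias], axis=1)

        score = np.full((T, 2), -np.inf)
        back = np.zeros((T, 2), dtype=int)
        score[0] = unary[0]
        for t in range(1, T):
            for state in (0, 1):
                stay = score[t - 1, state]
                switch = score[t - 1, 1 - state] - switch_cost
                if stay >= switch:
                    score[t, state] = stay + unary[t, state]
                    back[t, state] = state
                else:
                    score[t, state] = switch + unary[t, state]
                    back[t, state] = 1 - state

        # backtrack the best frame labelling
        labels = np.zeros(T, dtype=int)
        labels[-1] = int(np.argmax(score[-1]))
        for t in range(T - 1, 0, -1):
            labels[t - 1] = back[t, labels[t]]

        # contiguous runs of label 1 become proposals, scored by mean actionness
        proposals, start = [], None
        for t, lab in enumerate(labels):
            if lab == 1 and start is None:
                start = t
            elif lab == 0 and start is not None:
                proposals.append((start, t, float(actionness[start:t].mean())))
                start = None
        if start is not None:
            proposals.append((start, T, float(actionness[start:].mean())))
        return proposals

Each resulting segment would then inherit one of the top-k video-level labels and be rescored by combining its proposal score with the global classification score, as the abstract describes.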

Related papers

CUHK & ETHZ & SIAT Submission to ActivityNet Challenge 2016

This paper presents the method that underlies our submission to the untrimmed video classification task of the ActivityNet Challenge 2016. We follow the basic pipeline of very deep two-stream CNNs [16] and further raise the performance via a number of other techniques. Specifically, we use the latest deep model architectures, e.g. ResNet and Inception V3, and introduce a new aggregation scheme (top-k ...
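
The snippet is cut off at the aggregation scheme, but top-k pooling of per-snippet scores is commonly implemented along the lines of the sketch below; the exact scheme used by the authors is not stated here, so treat the function and its default k as assumptions.

    import numpy as np

    def top_k_aggregate(snippet_scores, k=5):
        """Hypothetical sketch of top-k aggregation: average the k highest
        per-class scores across snippets instead of mean-pooling all of them,
        so short activities are not drowned out by background snippets.

        snippet_scores : (N, C) array of per-snippet class scores.
        Returns a (C,) video-level score vector.
        """
        k = min(k, snippet_scores.shape[0])
        # sort each class column and keep only the k largest entries
        top = np.sort(snippet_scores, axis=0)[-k:]
        return top.mean(axis=0)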

ActivityNet Challenge 2017 Summary

Most traditional video action recognition methods are based on trimmed videos, which contain only one action per video, yet most videos in the real world are untrimmed. To overcome this difficulty to some extent, we propose a method based on the fusion of multiple features for the untrimmed video classification task of the ActivityNet challenge 2017. We use CNN features, MBH features and stacked...
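
As a rough illustration of the fusion idea, and under the assumption that fusion happens late, at the score level, with illustrative weights, a minimal Python sketch:

    import numpy as np

    def late_fusion(stream_scores, weights=None):
        """Hypothetical sketch of late fusion: combine per-class scores from
        classifiers trained on different features (e.g. CNN, MBH) with a
        weighted average. The weights are placeholders, not from the paper.

        stream_scores : list of (C,) arrays, one per feature stream.
        weights       : optional per-stream weights; uniform if omitted.
        """
        stream_scores = np.stack(stream_scores, axis=0)
        if weights is None:
            weights = np.full(len(stream_scores), 1.0 / len(stream_scores))
        weights = np.asarray(weights, dtype=float)
        weights = weights / weights.sum()
        return (weights[:, None] * stream_scores).sum(axis=0)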

Temporal Activity Detection in Untrimmed Videos with Recurrent Neural Networks

This work proposes a simple pipeline to classify and temporally localize activities in untrimmed videos. Our system uses features from a 3D Convolutional Neural Network (C3D) as input to train a recurrent neural network (RNN) that learns to classify video clips of 16 frames. After clip prediction, we post-process the output of the RNN to assign a single activity label to each video, and deter...
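
A minimal PyTorch sketch of the described pipeline is given below; the hidden size, the 4096-d C3D feature dimension, and the 201-way output (200 activities plus background) are assumptions about typical choices, not the authors' exact configuration.

    import torch
    import torch.nn as nn

    class ClipRNN(nn.Module):
        """Hypothetical sketch: an LSTM reads a sequence of C3D clip features
        (one vector per 16-frame clip) and emits per-clip class scores."""

        def __init__(self, feat_dim=4096, hidden=512, num_classes=201):
            super().__init__()
            self.rnn = nn.LSTM(feat_dim, hidden, batch_first=True)
            self.classifier = nn.Linear(hidden, num_classes)

        def forward(self, clip_features):
            # clip_features: (batch, num_clips, feat_dim)
            hidden_states, _ = self.rnn(clip_features)
            return self.classifier(hidden_states)  # per-clip class scores

    # toy usage: one video represented by 20 C3D clip features
    model = ClipRNN()
    scores = model(torch.randn(1, 20, 4096))     # (1, 20, 201)
    # toy post-processing: average clip scores and pick a single video label
    # (a stand-in for the paper's actual post-processing step)
    video_label = scores.mean(dim=1).argmax(-1)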

UC Merced Submission to the ActivityNet Challenge 2016

This notebook paper describes our system for the untrimmed classification task in the ActivityNet challenge 2016. We investigate multiple state-of-the-art approaches for action recognition in long, untrimmed videos. We exploit hand-crafted motion boundary histogram features as well as feature activations from deep networks such as VGG16, GoogLeNet, and C3D. These features are separately fed to lin...
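
The sentence is truncated, but assuming the per-video descriptors are fed to linear classifiers (e.g. linear SVMs), that step might look like this sketch with placeholder data; feature dimensions and class counts are illustrative only.

    import numpy as np
    from sklearn.svm import LinearSVC

    # Hypothetical sketch: one linear classifier per feature type (e.g. MBH
    # descriptors or pooled VGG16/C3D activations); the data is random
    # placeholder input, not the authors' setup.
    rng = np.random.default_rng(0)
    train_feats = rng.standard_normal((100, 4096))   # video-level descriptors
    train_labels = rng.integers(0, 10, size=100)     # activity labels

    clf = LinearSVC(C=1.0)
    clf.fit(train_feats, train_labels)

    test_feats = rng.standard_normal((5, 4096))
    print(clf.decision_function(test_feats).argmax(axis=1))  # predicted classes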

Budget-Aware Activity Detection with A Recurrent Policy Network

In this paper, we address the challenging problem of efficient temporal activity detection in untrimmed long videos. While most recent work has focused on advancing detection accuracy, inference can take from seconds to minutes to process a single video, which is computationally prohibitive for many applications with tight runtime constraints. This motivates our proposed budget-aware fram...

Journal:
  • CoRR

Volume: abs/1607.01979

Publication date: 2016